62 research outputs found
Artificial Neural Network Pruning to Extract Knowledge
Artificial Neural Networks (NN) are widely used for solving complex problems
from medical diagnostics to face recognition. Despite notable successes, the
main disadvantages of NN are also well known: the risk of overfitting, lack of
explainability (inability to extract algorithms from trained NN), and high
consumption of computing resources. Determining an appropriate NN
structure for each problem can help overcome these difficulties: an NN
that is too small cannot be trained successfully, while one that is too
large gives unexplainable results and has a high chance of overfitting.
Reducing the precision of NN parameters simplifies the implementation of
these NN, saves computing resources, and makes the NN skills more
transparent. This paper lists the basic NN simplification
problems and controlled pruning procedures to solve these problems. All the
described pruning procedures can be implemented in one framework. The developed
procedures, in particular, find the optimal structure of NN for each task,
measure the influence of each input signal and NN parameter, and provide a
detailed verbal description of the algorithms and skills of NN. The described
methods are illustrated by a simple example: the generation of explicit
algorithms for predicting the results of the US presidential election.
Comment: IJCNN 202
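The parameter-influence-and-pruning idea can be illustrated with a minimal magnitude-based pruning sketch. This is an illustrative stand-in, not the paper's controlled pruning procedures; the function name and keep fraction are hypothetical:

```python
import random

def prune_by_magnitude(weights, keep_fraction):
    """Zero out the smallest-magnitude weights, keeping `keep_fraction`
    of them. A deliberately simple proxy for measuring each parameter's
    influence and removing the least influential ones."""
    flat = sorted(abs(w) for w in weights)
    cutoff_index = int(len(flat) * (1 - keep_fraction))
    threshold = flat[cutoff_index] if cutoff_index < len(flat) else float("inf")
    return [w if abs(w) >= threshold else 0.0 for w in weights]

random.seed(0)
weights = [random.gauss(0, 1) for _ in range(100)]
pruned = prune_by_magnitude(weights, keep_fraction=0.2)
kept = sum(1 for w in pruned if w != 0.0)
```

In a real pruning framework the influence measure would come from the trained network (e.g. sensitivity of the loss to each parameter) rather than raw magnitude, but the keep-the-most-influential structure is the same.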
Data complexity measured by principal graphs
How to measure the complexity of a finite set of vectors embedded in a
multidimensional space? This is a non-trivial question which can be approached
in many different ways. Here we suggest a set of data complexity measures using
universal approximators, principal cubic complexes. Principal cubic complexes
generalise the notion of principal manifolds for datasets with non-trivial
topologies. The type of the principal cubic complex is determined by its
dimension and a grammar of elementary graph transformations. The simplest
grammar produces principal trees.
We introduce three natural types of data complexity: 1) geometric (deviation
of the data's approximator from some "idealized" configuration, such as
deviation from harmonicity); 2) structural (how many elements of a principal
graph are needed to approximate the data), and 3) construction complexity (how
many applications of elementary graph transformations are needed to construct
the principal object starting from the simplest one).
We compute these measures for several simulated and real-life data
distributions and show them in the "accuracy-complexity" plots, helping to
optimize the accuracy/complexity ratio. We discuss various issues connected
with measuring data complexity. Software for computing data complexity measures
from principal cubic complexes is provided as well.
Comment: Computers and Mathematics with Applications, in press
Fractional norms and quasinorms do not help to overcome the curse of dimensionality
The curse of dimensionality causes well-known and widely discussed
problems for machine learning methods. There is a hypothesis that using
the Manhattan distance and even fractional quasinorms lp (for p less than
1) can help to overcome the curse of dimensionality in classification
problems. In
this study, we systematically test this hypothesis. We confirm that fractional
quasinorms have a greater relative contrast or coefficient of variation than
the Euclidean norm l2, but we also demonstrate that the distance concentration
shows qualitatively the same behaviour for all tested norms and quasinorms and
the difference between them decays as dimension tends to infinity. Estimation
of classification quality for kNN based on different norms and quasinorms shows
that a greater relative contrast does not mean better classifier performance
and the worst performance for different databases was shown by different norms
(quasinorms). A systematic comparison shows that the difference of the
performance of kNN based on lp for p=2, 1, and 0.5 is statistically
insignificant.
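The relative-contrast comparison can be reproduced in miniature. The sketch below is illustrative only, not the study's experimental protocol: it estimates the relative contrast (Dmax - Dmin)/Dmin of lp distances for uniform random data, showing the larger contrast of fractional quasinorms and its decay with dimension:

```python
import math
import random

def lp_dist(x, y, p):
    """l_p distance (a quasinorm for p < 1) between two vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def relative_contrast(dim, p, n_points=200, seed=1):
    """Relative contrast (Dmax - Dmin) / Dmin of distances from the origin
    to uniformly sampled points; a standard proxy for distance concentration."""
    rng = random.Random(seed)
    dists = [lp_dist([rng.random() for _ in range(dim)], [0.0] * dim, p)
             for _ in range(n_points)]
    return (max(dists) - min(dists)) / min(dists)

# Smaller p gives a larger relative contrast at a fixed dimension, yet for
# every p the contrast shrinks as the dimension grows -- the qualitative
# behaviour is the same for all tested norms and quasinorms.
contrast = {(dim, p): relative_contrast(dim, p)
            for dim in (10, 1000) for p in (0.5, 1.0, 2.0)}
```

Using the same seed for every call pairs the samples across values of p, so the contrast comparison is not confounded by sampling noise.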
Long and short range multi-locus QTL interactions in a complex trait of yeast
We analyse interactions of Quantitative Trait Loci (QTL) in heat selected
yeast by comparing them to an unselected pool of random individuals. Here we
re-examine data on individual F12 progeny selected for heat tolerance, which
have been genotyped at 25 locations identified by sequencing a selected pool
[Parts, L., Cubillos, F. A., Warringer, J., Jain, K., Salinas, F., Bumpstead,
S. J., Molin, M., Zia, A., Simpson, J. T., Quail, M. A., Moses, A., Louis, E.
J., Durbin, R., and Liti, G. (2011). Genome research, 21(7), 1131-1138]. 960
individuals were genotyped at these locations and multi-locus genotype
frequencies were compared to 172 sequenced individuals from the original
unselected pool (a control group). Various non-random associations were found
across the genome, both within chromosomes and between chromosomes. Some of the
non-random associations are likely due to retention of linkage
disequilibrium in the F12 population; however, many, including the
inter-chromosomal interactions, must be due to genetic interactions in
heat tolerance. One region
of particular interest involves 3 linked loci on chromosome IV, where the
central variant responsible for heat tolerance is antagonistic, coming
from the heat-sensitive parent, while the flanking ones come from the
more heat-tolerant parent. The 3-locus haplotypes in the selected
individuals represent a highly
biased sample of the population haplotypes with rare double recombinants in
high frequency. These were missed in the original analysis and would never be
seen without the multigenerational approach. We show that a statistical
analysis of entropy and information gain in genotypes of a selected population
can reveal interactions beyond those previously seen. Importantly, this
must be done in comparison to the unselected population's genotypes to
account for inherent biases in the original population.
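The entropy and information-gain comparison between a selected and an unselected population can be sketched on toy data. The genotype labels below are hypothetical, not the yeast dataset:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of genotype labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(selected, control):
    """Entropy drop when the pooled genotypes are split by population --
    an illustrative analogue of the selected-vs-control comparison, using
    toy data rather than the sequenced F12 progeny."""
    pooled = selected + control
    n = len(pooled)
    weighted = (len(selected) / n) * entropy(selected) \
             + (len(control) / n) * entropy(control)
    return entropy(pooled) - weighted

# Toy two-locus genotypes: the selected pool is strongly biased toward 'AB',
# while the control pool is uniform over all four haplotypes.
selected = ["AB"] * 80 + ["ab"] * 20
control = ["AB"] * 25 + ["Ab"] * 25 + ["aB"] * 25 + ["ab"] * 25
gain = information_gain(selected, control)
```

A positive gain signals that the genotype distribution in the selected pool differs from the control, which is exactly why the comparison must be made against the unselected population rather than against a uniform expectation.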
Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation
Finding best architectures of learning machines, such as deep neural
networks, is a well-known technical and theoretical challenge. Recent work by
Mellor et al (2021) showed that there may exist correlations between the
accuracies of trained networks and the values of some easily computable
measures defined on randomly initialised networks, which may enable
searching tens of thousands of neural architectures without training.
Mellor et al used the Hamming distance evaluated over all ReLU neurons as
such a measure.
Motivated by these findings, in our work, we ask the question of the existence
of other and perhaps more principled measures which could be used as
determinants of success of a given neural architecture. In particular, we
examine whether the dimensionality and quasi-orthogonality of neural
networks' feature space could be correlated with the network's
performance after training. We show, using the same setup as in Mellor et
al, that dimensionality and quasi-orthogonality may jointly serve as
discriminants of a network's performance. In addition to offering new
opportunities to accelerate neural architecture search, our findings
suggest important relationships between the networks' final performance
and properties of their randomly initialised feature spaces: data
dimension and quasi-orthogonality.
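Quasi-orthogonality of a randomly initialised feature space can be quantified with a simple pairwise-cosine measure. The sketch below is an assumption-laden illustration (random Gaussian features, not the paper's exact measure) of the fact that random vectors become nearly orthogonal as dimension grows:

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def mean_abs_cosine(dim, n_vectors=50, seed=2):
    """Average |cos| over all pairs of random Gaussian vectors: a simple
    quasi-orthogonality measure (values near 0 mean near-orthogonal)."""
    rng = random.Random(seed)
    vecs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_vectors)]
    pairs = [(i, j) for i in range(n_vectors) for j in range(i + 1, n_vectors)]
    return sum(abs(cosine(vecs[i], vecs[j])) for i, j in pairs) / len(pairs)
```

For Gaussian vectors the expected |cos| scales roughly as sqrt(2/(pi*dim)), so high-dimensional feature spaces are quasi-orthogonal almost automatically, which is what makes such measures cheap to evaluate on untrained networks.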
Agile gesture recognition for capacitive sensing devices: adapting on-the-job
Automated hand gesture recognition has been a focus of the AI community
for decades. Traditionally, work in this domain revolved largely around
scenarios assuming the availability of a flow of images of the user's
hands. This has partly been due to the prevalence of camera-based devices
and the wide availability of image data. However, there is growing demand
for gesture recognition technology that can be implemented on low-power
devices using limited sensor data instead of high-dimensional inputs like
hand images. In this work, we demonstrate a hand gesture recognition
system and method that uses signals from capacitive sensors embedded into
the etee hand controller. The controller generates real-time signals from
each of the wearer's five fingers. We use a machine learning technique to
analyse the time series signals and identify three features that can
represent the five fingers within 500 ms. The analysis is composed of a
two-stage training strategy, including dimension reduction through
principal component analysis and classification with k-nearest
neighbours. Remarkably, we found that this combination showed a level of
performance comparable to more advanced methods such as a supervised
variational autoencoder. The base system can also be equipped with the
capability to learn from occasional errors by providing it with an
additional adaptive error correction mechanism. The results showed that
the error corrector improves the classification performance of the base
system without compromising its efficiency. The system requires no more
than 1 ms of computing time per input sample and is smaller than deep
neural networks, demonstrating the feasibility of agile gesture
recognition systems based on this technology.
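The two-stage PCA + k-nearest-neighbour strategy can be sketched end to end on synthetic signals. This uses toy data and a one-component PCA via power iteration; it is not the etee system's actual pipeline, and all names are illustrative:

```python
import math
import random

def power_iteration_pc1(data, iters=200):
    """First principal direction via power iteration on the covariance --
    a minimal stand-in for the PCA dimension-reduction stage."""
    dim = len(data[0])
    n = len(data)
    mean = [sum(row[d] for row in data) / n for d in range(dim)]
    centred = [[row[d] - mean[d] for d in range(dim)] for row in data]
    v = [1.0] * dim
    for _ in range(iters):
        # Apply the covariance without forming it: C v = X^T (X v) / n.
        xv = [sum(r[d] * v[d] for d in range(dim)) for r in centred]
        v = [sum(xv[i] * centred[i][d] for i in range(n)) / n
             for d in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return mean, v

def project(row, mean, v):
    """Project one sample onto the first principal direction."""
    return sum((row[d] - mean[d]) * v[d] for d in range(len(row)))

def knn_predict(train_x, train_y, x, k=3):
    """k-nearest-neighbour majority vote on the 1-D projections."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy capacitive-style signals: two gesture classes separated along one
# direction in a 5-dimensional (five-finger) signal space.
rng = random.Random(3)
data, labels = [], []
for label, offset in (("pinch", 0.0), ("grab", 3.0)):
    for _ in range(40):
        data.append([offset + rng.gauss(0, 0.5) for _ in range(5)])
        labels.append(label)

mean, pc1 = power_iteration_pc1(data)
projected = [project(row, mean, pc1) for row in data]
pred = knn_predict(projected, labels, project([3.0] * 5, mean, pc1))
```

The appeal of this combination for low-power devices is that inference is just one dot product per principal component followed by a nearest-neighbour lookup.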
Robust And Scalable Learning Of Complex Dataset Topologies Via Elpigraph
Large datasets represented by multidimensional data point clouds often
possess non-trivial distributions with branching trajectories and excluded
regions, with recent single-cell transcriptomic studies of developing
embryos being notable examples. Reducing the complexity and producing compact
and interpretable representations of such data remains a challenging task. Most
of the existing computational methods are based on exploring the local data
point neighbourhood relations, a step that can perform poorly in the case of
multidimensional and noisy data. Here we present ElPiGraph, a scalable and
robust method for approximation of datasets with complex structures which does
not require computing the complete data distance matrix or the data point
neighbourhood graph. This method is able to withstand high levels of noise and
is capable of approximating complex topologies via principal graph ensembles
that can be combined into a consensus principal graph. ElPiGraph deals
efficiently with large and complex datasets in various fields from biology,
where it can be used to infer gene dynamics from single-cell RNA-Seq, to
astronomy, where it can be used to explore complex structures in the
distribution of galaxies.
Comment: 32 pages, 14 figures
Personality Traits and Drug Consumption. A Story Told by Data
This is a preprint version of the first book from the series: "Stories told
by data". In this book a story is told about the psychological traits
associated with drug consumption. The book includes:
- A review of published works on the psychological profiles of drug users.
- Analysis of a new original database with information on 1885 respondents
and usage of 18 drugs. (Database is available online.)
- An introductory description of the data mining and machine learning methods
used for the analysis of this dataset.
- The demonstration that the personality traits (five factor model,
impulsivity, and sensation seeking), together with simple demographic
data, make it possible to predict the risk of consumption of individual
drugs with sensitivity and specificity above 70% for most drugs.
- The analysis of correlations of use of different substances and the
description of the groups of drugs with correlated use (correlation pleiades).
- Proof of significant differences of personality profiles for users of
different drugs. This is explicitly proved for benzodiazepines, ecstasy, and
heroin.
- Tables of personality profiles for users and non-users of 18 substances.
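The sensitivity and specificity criteria quoted above can be computed as follows; the predictions here are hypothetical, not results from the book's database:

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), for binary
    labels (1 = drug user, 0 = non-user) -- the quality criteria used to
    assess the risk predictors."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predictions for ten respondents.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Reporting both criteria matters because drug-use classes are typically imbalanced: a classifier that labels everyone a non-user scores high accuracy but zero sensitivity.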
The book is aimed at advanced undergraduates or first-year PhD students, as
well as researchers and practitioners. No previous knowledge of machine
learning, advanced data mining concepts or modern psychology of personality is
assumed. For a more detailed introduction to statistical methods we
recommend several undergraduate textbooks. Familiarity with basic
statistics and some experience in the use of probabilities would be
helpful, as well as some basic technical understanding of psychology.
Comment: A preprint version prepared by the authors before the Springer
editorial work. 124 pages, 27 figures, 63 tables, bibl. 24